Dual-Branch Reconstruction Network for Industrial Anomaly Detection with RGB-D Data
Unsupervised anomaly detection methods are at the forefront of industrial
anomaly detection efforts and have made notable progress. Previous work
primarily used 2D information as input, while multi-modal industrial
anomaly detection based on 3D point clouds and RGB images is only
beginning to emerge. The common approach utilizes large pre-trained models
for feature representation and stores the extracted features in memory
banks. However, such methods require longer inference times and higher
memory usage, and therefore cannot meet the real-time requirements of
industry. To overcome these issues, we propose a lightweight dual-branch
reconstruction network (DBRN) based on RGB-D input that learns the
decision boundary between normal and abnormal examples. Using depth maps
instead of point clouds as input eliminates the requirement for alignment
between the two modalities. Furthermore, we introduce an importance
scoring module in the discriminative network to assist in fusing features
from the two modalities, thereby obtaining a comprehensive discriminative
result. DBRN achieves 92.8% AUROC with high inference efficiency on the
MVTec 3D-AD dataset, without large pre-trained models or memory banks.
Comment: 8 pages, 5 figures
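The abstract describes the importance scoring module only at a high level. As a rough illustration of the idea (the softmax weighting, function names, and per-pixel logits are assumptions for this sketch, not the paper's actual design), per-pixel anomaly maps from the RGB and depth branches can be fused with learned importance weights:

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_anomaly_maps(rgb_map, depth_map, rgb_logit, depth_logit):
    """Weight each modality's per-pixel anomaly score by an importance
    weight (softmax over per-pixel logits, which would be learned)."""
    w = softmax(np.stack([rgb_logit, depth_logit]), axis=0)
    return w[0] * rgb_map + w[1] * depth_map

# With equal logits the fusion reduces to a plain average of the branches.
rgb = np.array([[0.1, 0.9], [0.4, 0.2]])
depth = np.array([[0.3, 0.1], [0.8, 0.6]])
fused = fuse_anomaly_maps(rgb, depth, np.zeros((2, 2)), np.zeros((2, 2)))
```

A learned scorer would predict the logits from intermediate features; the fusion itself stays this cheap, which fits the paper's emphasis on inference efficiency.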
CL-Flow: Strengthening the Normalizing Flows by Contrastive Learning for Better Anomaly Detection
In the anomaly detection field, the scarcity of anomalous samples has
directed the current research emphasis towards unsupervised anomaly
detection. While these unsupervised methods offer convenience, they also
overlook the crucial prior information embedded within anomalous samples.
Moreover, among deep learning methods, supervised methods generally
exhibit superior performance compared to unsupervised ones. For these
reasons, we propose a self-supervised anomaly detection approach that
combines contrastive learning with 2D-Flow to achieve more precise
detection and faster inference. On one hand, we introduce a novel
approach to anomaly synthesis, yielding anomalous samples consistent with
authentic industrial scenarios, alongside their surrogate annotations. On
the other hand, having obtained a substantial number of anomalous
samples, we enhance the 2D-Flow framework with contrastive learning,
leveraging diverse proxy tasks to fine-tune the network. Our approach
enables the network to learn more precise mapping relationships from
self-generated labels while retaining the lightweight characteristics of
2D-Flow. Compared to mainstream unsupervised approaches, our
self-supervised method demonstrates superior detection accuracy, fewer
additional model parameters, and faster inference. Furthermore, the
entire training and inference process is end-to-end. Our approach sets
new state-of-the-art results, achieving 99.6% image-level AUROC on the
MVTecAD dataset and 96.8% image-level AUROC on the BTAD dataset.
Comment: 6 pages, 6 figures
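The abstract does not give details of the anomaly-synthesis procedure. A minimal sketch in the spirit of cut-and-paste augmentation (the patch size, noise level, and function names are assumptions, not the paper's method) shows how an anomalous sample and its surrogate pixel-level annotation can be generated together:

```python
import numpy as np

def synthesize_anomaly(image, patch_size=8, rng=None):
    """Cut a patch from a random location, perturb it, and paste it at
    another location; return the augmented image and its surrogate mask."""
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = image.shape[:2]
    ph = pw = patch_size
    ys, xs = rng.integers(0, h - ph), rng.integers(0, w - pw)  # source
    yd, xd = rng.integers(0, h - ph), rng.integers(0, w - pw)  # destination
    patch = image[ys:ys + ph, xs:xs + pw].astype(float)
    patch = np.clip(patch + rng.normal(0, 25, patch.shape), 0, 255)
    out = image.astype(float).copy()
    out[yd:yd + ph, xd:xd + pw] = patch
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[yd:yd + ph, xd:xd + pw] = 1  # surrogate annotation
    return out.astype(np.uint8), mask

img = np.full((32, 32), 128, dtype=np.uint8)
out, mask = synthesize_anomaly(img, patch_size=8)
```

The synthetic pairs (image, mask) can then serve as self-generated labels for the proxy tasks mentioned in the abstract.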
A Precisely One-Step Registration Methodology for Optical Imagery and LiDAR Data Using Virtual Point Primitives
The registration of optical imagery and 3D Light Detection and Ranging (LiDAR) point data continues to be a challenge for various applications in photogrammetry and remote sensing. In this paper, the framework employs a new registration primitive called the virtual point (VP), which can be generated from the linear features within a LiDAR dataset, including straight lines (SL) and curved lines (CL). By using an auxiliary parameter (λ), the one-step registration transformation model can be computed accurately and quickly. The transformation model parameters and the λs are calculated simultaneously by applying the least-squares method recursively. Urban areas contain many buildings of different shapes, so building boundaries provide a large number of SL and CL features; properly selecting linear features and transforming them into VPs can reduce the errors caused by the semi-discrete random characteristics of the LiDAR points. According to the results presented in the paper, the registration precision reaches the 1–2 pixel level of the optical images.
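The abstract does not state the exact transformation model. The sketch below assumes a simplified 2D affine model q ≈ A·p + t (the function names, the alternating scheme, and the synthetic data are illustrative assumptions) and shows how an auxiliary parameter λ, which locates a virtual point p = a + λ·u on each line feature, can be solved together with the transform by applying least squares recursively:

```python
import numpy as np

def solve_transform(points, obs):
    """Linear least squares for q ≈ A p + t (2D affine: 6 unknowns)."""
    n = len(points)
    M = np.zeros((2 * n, 6))
    M[0::2, 0:2] = points   # rows for the x-coordinate equations
    M[0::2, 4] = 1.0
    M[1::2, 2:4] = points   # rows for the y-coordinate equations
    M[1::2, 5] = 1.0
    x, *_ = np.linalg.lstsq(M, obs.reshape(-1), rcond=None)
    return x[:4].reshape(2, 2), x[4:]

def register(anchors, dirs, obs, iters=500):
    """Alternate between (A, t) and the per-line parameters lambda_i
    that place each virtual point p_i = a_i + lambda_i * u_i."""
    lam = np.zeros(len(anchors))
    for _ in range(iters):
        A, t = solve_transform(anchors + lam[:, None] * dirs, obs)
        Au = dirs @ A.T                   # image of each line direction
        resid = obs - anchors @ A.T - t   # residual with lambda = 0
        lam = np.sum(Au * resid, axis=1) / np.sum(Au * Au, axis=1)
    return A, t, lam

# Synthetic check: recover a known transform from virtual points on lines.
rng = np.random.default_rng(42)
A_true = np.array([[1.2, -0.3], [0.4, 1.1]])
t_true = np.array([5.0, -2.0])
anchors = rng.uniform(-10, 10, (12, 2))
ang = rng.uniform(0, np.pi, 12)
dirs = np.stack([np.cos(ang), np.sin(ang)], axis=1)
lam_true = rng.uniform(-0.5, 0.5, 12)
obs = (anchors + lam_true[:, None] * dirs) @ A_true.T + t_true
A_est, t_est, lam_est = register(anchors, dirs, obs)
```

Each λ-step is a closed-form 1D projection given the current transform, and each transform step is an ordinary linear least-squares solve, which matches the abstract's "accurate and fast" recursive solution in spirit.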
Extracting Urban Road Footprints from Airborne LiDAR Point Clouds with PointNet++ and Two-Step Post-Processing
In this paper, a novel framework for the automatic extraction of road footprints from airborne LiDAR point clouds in urban areas is proposed. The extraction process consists of three phases. First, road points are extracted using the deep learning model PointNet++, where the input features include not only those selected from the raw LiDAR points, such as 3D coordinates and intensity, but also the digital numbers (DN) of co-registered images and generated geometric features that describe a strip-like road. Second, the road points from PointNet++ are post-processed based on graph cuts and constrained triangulated irregular networks, which greatly reduces both commission and omission errors. Finally, collinearity and width similarity are proposed to estimate the connection probability of road segments, thereby improving the connectivity and completeness of the road network represented by centerlines. Experiments conducted on the Vaihingen data show that the proposed framework outperformed others in terms of completeness and correctness; in addition, some narrower residential streets of 2 m width, which have normally been neglected by previous studies, were extracted. The completeness and correctness of the extracted road points were 84.7% and 79.7%, respectively, while the completeness and correctness of the extracted centerlines were 97.0% and 86.3%, respectively.
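The abstract names collinearity and width similarity as the cues for connecting road segments but not the exact scoring. A hypothetical combination (the product form, segment representation, and names here are assumptions for illustration) might look like:

```python
import math

def connection_probability(seg_a, seg_b):
    """Segments are ((x1, y1), (x2, y2), width). Combine collinearity of
    the two centerline directions (|cos| of the angle between them, so
    the score is in [0, 1]) with width similarity (min/max ratio)."""
    (ax1, ay1), (ax2, ay2), wa = seg_a
    (bx1, by1), (bx2, by2), wb = seg_b
    da = (ax2 - ax1, ay2 - ay1)
    db = (bx2 - bx1, by2 - by1)
    na, nb = math.hypot(*da), math.hypot(*db)
    collinearity = abs(da[0] * db[0] + da[1] * db[1]) / (na * nb)
    width_sim = min(wa, wb) / max(wa, wb)
    return collinearity * width_sim

# Two collinear segments of equal width score 1; perpendicular ones score 0.
a = ((0.0, 0.0), (10.0, 0.0), 4.0)
b = ((12.0, 0.0), (22.0, 0.0), 4.0)
c = ((12.0, 0.0), (12.0, 10.0), 4.0)
```

Segment pairs whose probability exceeds a threshold would then be joined when assembling the centerline network.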